ABSTRACT

Real-time group editors allow a group of users to view and
edit the same document at the same time from geographically
dispersed sites connected by communication networks.
Consistency maintenance is one of the most significant chal- 
lenges in the design and implementation of these types of 
systems. Research on real-time group editors in the past 
decade has invented an innovative technique for consistency 
maintenance, called operational transformation. This paper 
presents an integrative review of the evolution of operational 
transformation techniques, with the goals of identifying the 
major issues, algorithms, achievements, and remaining chal- 
lenges. In addition, this paper contributes a new optimized 
generic operational transformation control algorithm. 
INTRODUCTION

Real-time group editors allow a group of users to view and
edit the same text/graphic/image/multimedia document at
the same time from geographically dispersed sites connected
by communication networks. These types of groupware
systems are not only very useful tools in the area of CSCW [5],
but also serve as excellent vehicles for exploring a range of
fundamental and challenging issues facing the designers of
real-time groupware systems in general. One such issue is
consistency maintenance of shared documents under the
constraints of short response time and support for free and
concurrent editing in distributed environments [17].

Research on real-time group editors in the past decade
has invented an innovative technique for consistency
maintenance, under the name of operational transformation,
which was pioneered by the GROVE (GRoup Outline Viewing
Editor) system in 1989 [3]. Since then, several research groups
have independently extended the operational transformation
technique in their design and implementation of these types
of systems. Major representatives in this area include the
REDUCE (REal-time Distributed Unconstrained Cooperative
Editing) system [14, 15, 16, 17], the Jupiter system [11],
and the adOPTed algorithm [13]. This paper will present an
integrative review of the evolution of operational
transformation techniques, with the goals of identifying the
major issues, algorithms, achievements, and remaining
challenges. In addition, this paper will contribute a new
optimized generic operational transformation control
algorithm. This paper will focus exclusively on
transformation-based consistency maintenance algorithms. For
discussion of alternative consistency maintenance techniques,
such as turn-taking, locking, serialization, and transactions,
the reader is referred to [5, 7, 8, 10, 17].

The rest of this paper is organized as follows: First, some 
basic concepts and terminologies are introduced. Then, the 
operational transformation algorithm in the GROVE system 
is reviewed to see where the original work was started and 
what problems were left unsolved. Next, the problems with 
the original GROVE transformation algorithm are analyzed, 
and three different approaches to solving them are discussed 
one by one, including the REDUCE approach, the Jupiter 
approach, and the adOPTed approach. Furthermore, a new 
optimized generic operational transformation control algo- 
rithm is proposed. Finally, the paper is concluded with a 
summary of the major achievements so far and remaining 
challenges for future research. 

PRELIMINARIES 

In this section, some basic concepts and terminologies are in- 
troduced. Following Lamport [9], we define a causal (partial) 
ordering relation on operations in terms of their generation 
and execution sequences as follows. 

Definition 1: Causal ordering relation 
Given two operations $O_a$ and $O_b$, generated at sites $i$ and
$j$, then $O_a \rightarrow O_b$, iff: (1) $i = j$ and the generation of $O_a$
happened before the generation of $O_b$, or (2) $i \neq j$ and the
execution of $O_a$ at site $j$ happened before the generation of
$O_b$, or (3) there exists an operation $O_x$, such that
$O_a \rightarrow O_x$ and $O_x \rightarrow O_b$. □

Definition 2: Dependent and independent operations 
Given any two operations $O_a$ and $O_b$: (1) $O_b$ is dependent
on $O_a$ iff $O_a \rightarrow O_b$. (2) $O_a$ and $O_b$ are independent (or
concurrent), expressed as $O_a \parallel O_b$, iff neither
$O_a \rightarrow O_b$ nor $O_b \rightarrow O_a$. □

To illustrate, consider a real-time group editing session 
with three sites, as shown in the time-space graph of Figure 1. 



[Figure 1: time-space graph showing site 0, site 1, and site 2]

Fig. 1. A scenario of a real-time group editing session.

There are four editing operations in this scenario: operation
$O_1$ generated at site 0, operations $O_2$ and $O_3$ generated at
site 1, and operation $O_4$ generated at site 2. It is assumed
in this scenario that an operation is executed immediately at
the local site, then propagated to remote sites and executed
there upon its arrival. The arrows in the graph represent
the propagation of operations from the local site to remote
sites. Each vertical line in the graph represents the activities
performed by the corresponding site. At site 1, for example,
$O_2$ is executed first, followed by $O_1$, $O_3$, and $O_4$.

According to Definitions 1 and 2, there are three pairs of
dependent operations in this scenario: $O_1 \rightarrow O_3$,
$O_2 \rightarrow O_3$, and $O_2 \rightarrow O_4$, because the execution of $O_1$
happens before the generation of $O_3$, the generation of $O_2$
happens before the generation of $O_3$, and the execution of
$O_2$ happens before the generation of $O_4$. Moreover, there are
three pairs of independent operations in this scenario:
$O_1 \parallel O_2$, $O_1 \parallel O_4$, and $O_3 \parallel O_4$, because for any pair,
neither operation’s execution happens before the other
operation’s generation. As will be seen in the following
discussion, several fundamental inconsistency problems are
embedded in this scenario. Moreover, the seemingly simple
independence relationship among operations in this scenario is
actually quite intricate, and has posed significant technical
challenges to the design of correct operational transformation
algorithms [17].
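
The dependent and independent pairs listed above can be derived mechanically from Definitions 1 and 2. The following Python sketch is only an illustration: the names `GEN_SITE`, `EXEC_ORDER`, `causal_order`, and `concurrent_pairs` are hypothetical, the per-site execution orders are those stated for Fig. 1 in the discussion of the GROVE approach, and it relies on the stated assumption that an operation is executed at its local site immediately after generation.

```python
from itertools import combinations

# Generation site of each operation in Fig. 1.
GEN_SITE = {"O1": 0, "O2": 1, "O3": 1, "O4": 2}

# Execution order at each site, as listed in the text for Fig. 1.
EXEC_ORDER = {
    0: ["O1", "O2", "O4", "O3"],
    1: ["O2", "O1", "O3", "O4"],
    2: ["O2", "O4", "O3", "O1"],
}


def causal_order(gen_site, exec_order):
    """Return the set of pairs (a, b) such that a -> b under Definition 1."""
    before = set()
    # Conditions (1) and (2): a was generated or executed at b's site
    # before b was generated there (local execution is immediate).
    for op, site in gen_site.items():
        order = exec_order[site]
        for earlier in order[: order.index(op)]:
            before.add((earlier, op))
    # Condition (3): transitive closure.
    changed = True
    while changed:
        changed = False
        for a, b in list(before):
            for c, d in list(before):
                if b == c and (a, d) not in before:
                    before.add((a, d))
                    changed = True
    return before


def concurrent_pairs(gen_site, before):
    """Return the pairs (a, b) with a || b under Definition 2."""
    return {
        (a, b)
        for a, b in combinations(sorted(gen_site), 2)
        if (a, b) not in before and (b, a) not in before
    }


dependent = causal_order(GEN_SITE, EXEC_ORDER)
independent = concurrent_pairs(GEN_SITE, dependent)
```

Under these assumptions, `dependent` holds the three dependent pairs and `independent` the three independent pairs named in the text.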

THE GROVE APPROACH 

To achieve good responsiveness and avoid a single point of
failure in the system, a replicated architecture has been
adopted by GROVE: the shared documents are replicated at
the local storage of each participating site. An (update)
operation is executed on the local replica of the shared
document immediately after its generation, then broadcast to
remote sites for execution (after some delay and transformation).

Divergence and causality-violation problems
If remote operations are executed upon their arrival and in
their original form, two inconsistency problems may occur in a
concurrent editing session, as identified in GROVE: one is
divergence, and the other is causality-violation.

For example, consider the scenario shown in Fig. 1. The
four operations arrive and are executed in the following
orders: $O_1$, $O_2$, $O_4$, and $O_3$ at site 0; $O_2$, $O_1$, $O_3$, and $O_4$
at site 1; and $O_2$, $O_4$, $O_3$, and $O_1$ at site 2. If operations
are not commutative, final editing results would not be
identical among cooperating sites. This problem is called
divergence. Clearly, the divergence problem should be
prohibited for applications where the consistency of the final
results is required.

Moreover, since each cooperating site generates and
broadcasts operations without synchronization, operations may
arrive and be executed in an order different from their natural
causal order. As shown in Fig. 1, operation $O_3$ is generated
after the arrival of $O_1$ at site 1, so $O_1 \rightarrow O_3$. However,
since $O_3$ arrives before $O_1$ at site 2, the execution of $O_3$
before $O_1$ may result in an undefined operation $O_3$, which
refers to a nonexistent context to be created by $O_1$, or a
confused user at site 2, who observes the effect in $O_3$ before
observing the cause in $O_1$. This problem is called
causality-violation. Out-of-causal-order execution should be
prohibited for applications where a synchronized interaction
among multiple users is required.

Consistency correctness criteria 
Based on the identification of the two inconsistency problems, 
the GROVE consistency correctness criteria were defined by 
the following two properties: 

1. Convergence property: copies of the shared docu- 
ment are identical at all sites at quiescence (i.e., all gen- 
erated operations have been executed at all sites). 

2. Precedence property: if one operation $O_a$ causally
precedes another operation $O_b$, then at each site the
execution of $O_a$ happens before the execution of $O_b$.

In search of a solution where the only constraint on
execution order is the causal ordering among operations, GROVE
invented the later well-known distributed Operational
Transformation (dOPT) algorithm. GROVE’s solution consists
of two components: one is the state-vector timestamping 
scheme for ensuring the precedence property, and the other 
is the dOPT algorithm for ensuring the convergence prop- 
erty. The basic idea of the dOPT algorithm is that when 
an operation satisfies the precedence condition for execution, 
it is transformed against independent operations in the Log 
(which saves all executed operations in the order of their ex- 
ecution) in such a way that executions of the same set of 
properly transformed independent operations in different or- 
ders produce identical document states, thus ensuring the 
convergence property. 

A transformation property 

To ensure convergence, the dOPT algorithm requires the
transformation function $T$ to satisfy the following condition:
For any two independent operations $O_a$ and $O_b$, suppose that
$O_a' = T(O_a, O_b)$ and $O_b' = T(O_b, O_a)$; it must be that

    $O_a \circ O_b' \equiv O_b \circ O_a'$

where “$\equiv$” means the two sequences of operations
$O_a \circ O_b'$ and $O_b \circ O_a'$ are equivalent in the sense that
when applied on the same input document state they produce the
same output document state.
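
As an illustration of this condition, the Python sketch below checks it exhaustively for a pair of single-character insert operations on a short document. It is not the GROVE transformation function: `Insert`, `transform`, `apply`, and `satisfies_condition` are hypothetical names, and the tie-breaking rule (the operation from the site with the smaller identifier is shifted) is an assumption made for the example.

```python
from typing import NamedTuple


class Insert(NamedTuple):
    char: str
    pos: int   # 1-based position
    site: int  # site identifier, used only to break ties


def transform(oa: Insert, ob: Insert) -> Insert:
    """Return oa adjusted to run after ob (both defined on the same state)."""
    shift = ob.pos < oa.pos or (ob.pos == oa.pos and oa.site < ob.site)
    return oa._replace(pos=oa.pos + 1) if shift else oa


def apply(doc: str, op: Insert) -> str:
    """Execute an insert on a document string."""
    return doc[: op.pos - 1] + op.char + doc[op.pos - 1 :]


def satisfies_condition(doc: str, oa: Insert, ob: Insert) -> bool:
    """Check that O_a followed by O_b' equals O_b followed by O_a'."""
    via_a = apply(apply(doc, oa), transform(ob, oa))
    via_b = apply(apply(doc, ob), transform(oa, ob))
    return via_a == via_b


# Try every pair of insert positions on the document "abc".
all_ok = all(
    satisfies_condition("abc", Insert("x", pa, 1), Insert("y", pb, 2))
    for pa in range(1, 5)
    for pb in range(1, 5)
)
```

For this insert-only function the check succeeds for every pair of positions; it says nothing about deletes or about more than two concurrent operations.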

In addition to the above formally specified condition,
GROVE also recognized that there were some circumstances in
which the transformation function should achieve an effect
which is non-serializable. For example, suppose $O_a$ and $O_b$
are two independent (character-wise) delete operations
referring to the same position; then $T$ must ensure only one
character is eventually deleted no matter in which order $O_a$
and $O_b$ are executed. This non-serializable effect is, however,
not captured by the above formal condition for $T$.

A sketch of the dOPT algorithm 
The transformation function T relies on the semantics of the 
editing operations and hence is application-dependent. The 
dOPT algorithm, however, is generic and takes care of select- 
ing operations for transformation and determining the trans- 
formation order. The basic control structure of the dOPT 
algorithm is simple: Given a causally ready operation O, the 
dOPT algorithm scans the Log to transform O against any 
operation in the Log which is independent of O; then the
transformed O, denoted as EO (i.e., the execution form of 
O), is executed and saved in the Log. The dOPT algorithm 
is sketched below. 

dOPT(O, Log) {
    EO = O;
    /* n is the number of operations in the Log */
    for (i = 1; i <= n; i++) {
        if (Log[i] || O)
            then EO = T(EO, Log[i]);
    }
    Execute EO;
    Append EO at the end of the Log;
}

An unsolved dOPT puzzle 

In [3] (Fig. 4 in Section 6: Discussion of Correctness), one
scenario was identified, where the dOPT algorithm could not
ensure convergence. This scenario (footnote 1) is re-displayed
in Fig. 2.


[Figure 2: time-space graph showing site 3, site 1, and site 2]

Fig. 2. The mixed priority example, in which the dOPT algorithm
failed to ensure convergence.

Suppose the GROVE transformation function uses the
following priority rule: when two insert operations have
the same position parameter, the position of the operation
with a lower priority (i.e., smaller site identifier) will be
shifted (footnote 2). According to the generic dOPT algorithm
and the application-dependent transformation function in [3],
the operation transformation and the final document states at
the three sites are as follows (assume the initial document is
empty).

Footnote 1: In fact, the scenario in Fig. 2 can be obtained by
removing $O_2$ from the scenario illustrated in Fig. 1.

Footnote 2: It should be noted that this priority rule is
actually opposite to the one used in the definition of
transformation function $T_{11}$ in [3]. This change is necessary
to correctly illustrate the problem the GROVE designers really
intended to illustrate.


At site 3, $O_3$ first inserts “z” into the document
(footnote 3). When $O_1$ arrives, it inserts “x” in front of “z”
to get a document with “xz”. Finally, when $O_2$ arrives, since
$O_2 \parallel O_3$ and $O_2 \parallel O_1$, it is first transformed against
$O_3$ and becomes $O_2' = Insert[y, 2]$ due to its lower priority
than $O_3$; then it is transformed against $O_1$ and becomes
$O_2'' = Insert[y, 3]$. After the execution of $O_2''$, the
document contains “xzy” (footnote 4). At site 1, the process of
operation transformation and the final result are the same as
that at site 3. At site 2, $O_2$ first inserts “y” into the
document. When $O_3$ arrives, it has to be transformed against
$O_2$ since $O_3 \parallel O_2$, but no change is made to $O_3$ due to
its higher priority than $O_2$. After the execution of $O_3$, the
document contains “zy”. Finally, when $O_1$ arrives, it has to
be transformed against $O_2$ since $O_1 \parallel O_2$. The
transformation of $O_1$ against $O_2$ will produce
$O_1' = Insert[x, 2]$ due to its lower priority than $O_2$. $O_1'$
does not need to be transformed against $O_3$ since
$O_3 \rightarrow O_1$. After the execution of $O_1'$, the document
contains “zxy”, which is not identical to “xzy” at sites 3
and 1.
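
The walkthrough above can be replayed with a small simulation. The Python sketch below is a minimal model, not the GROVE implementation: `Insert`, `transform`, `dopt`, and `run_site` are hypothetical names, the transformation function covers only character inserts with the priority rule stated above, the original position 1 of each operation is inferred from the walkthrough, and causality is supplied as an explicit set of pairs rather than derived from state vectors.

```python
from typing import NamedTuple


class Insert(NamedTuple):
    name: str
    char: str
    pos: int   # 1-based position
    site: int  # generating site; a smaller identifier means lower priority


def transform(oa, ob):
    """Shift oa if ob inserts before it, or at the same position with higher priority."""
    shift = ob.pos < oa.pos or (ob.pos == oa.pos and oa.site < ob.site)
    return oa._replace(pos=oa.pos + 1) if shift else oa


def dopt(op, log, precedes):
    """Transform op against every logged operation that is independent of it."""
    eo = op
    for logged in log:
        dependent = (logged.name, op.name) in precedes or (op.name, logged.name) in precedes
        if not dependent:
            eo = transform(eo, logged)
    return eo


def run_site(arrival_order, precedes):
    """Execute operations in arrival order and return the final document."""
    doc, log = [], []
    for op in arrival_order:
        eo = dopt(op, log, precedes)
        doc.insert(eo.pos - 1, eo.char)
        log.append(eo)  # the Log stores execution forms
    return "".join(doc)


O1 = Insert("O1", "x", 1, site=1)
O2 = Insert("O2", "y", 1, site=2)
O3 = Insert("O3", "z", 1, site=3)
PRECEDES = {("O3", "O1")}  # O3 -> O1; every other pair is independent

site3 = run_site([O3, O1, O2], PRECEDES)
site2 = run_site([O2, O3, O1], PRECEDES)
```

In this model `site3` and `site2` end with different strings, reproducing the divergence described in the text. At site 2, $O_1$ is transformed against $O_2$ even though the two operations were generated from different document states.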

The problem illustrated in Fig. 2 is fundamental to the
correctness of the operational transformation approach. As
correctly pointed out in [3], this problem could not be fixed by
simply reversing the priority rule, since this patch works in
this case but fails in other rather similar cases. In search of a
correct solution to this problem, the simple-minded priority
scheme (using a single site identifier) was thought to be the
root of the problem, thus a sophisticated (and complicated)
priority scheme (using a list of site identifiers) was proposed
in [3]. This new priority scheme did not prove to be successful
in solving the problem, thus leaving one unsolved puzzle to the
groupware research community.

The innovative idea of maintaining consistency by
operational transformation, as well as the unsolved dOPT puzzle,
has been a major inspiration and stimulation to a number
of research groups in the area of real-time groupware
systems. In fact, several research groups [1, 11, 13, 17] have
independently re-discovered that the dOPT algorithm did
not work whenever an operation is concurrent with two or
more dependent operations, and different approaches have
been proposed to fix it. In the following sections, three
alternative approaches will be discussed, including the REDUCE
approach [14, 15, 17] using a 1-dimensional data structure
for keeping track of executed operations, the Jupiter
approach [11] using a 2-dimensional data structure for
maintaining executed operations, and the adOPTed approach [13]
using an N-dimensional data structure (where N is the number
of cooperating sites in the system) for maintaining executed
operations.

THE REDUCE APPROACH 

REDUCE follows GROVE in adopting a fully distributed and 
replicated system architecture. A linear History Buffer (HB), 
which is the same as the Log in GROVE, is used to keep track 
of all executed operations. In addition, a garbage collection 
scheme was devised to remove useless operations from the 
HB [17].

Footnote 3: In this paper, the characters in a text document
are referred to (or addressed) from 1 to the end of the document.

Footnote 4: It should be pointed out that this result is
different from what was presented in [3].
The intention-violation problem

Apart from divergence and causality-violation problems, one
special kind of inconsistency problem, intention-violation,
has been identified in REDUCE [14].

To illustrate, consider the two independent operations $O_1$
and $O_2$ in the scenario shown in Fig. 1. At site 0, $O_2$ is
executed on a document state which has been changed by
the preceding execution of $O_1$. Therefore, the subsequent
execution of $O_2$ may refer to an incorrect position in the new
document state, and result in an editing effect different from
the intention of $O_2$, which is defined as the editing effect
which could be achieved by applying $O_2$ on the document state
from which $O_2$ was generated [14].

For example, assume the shared document initially
contains the following sequence of characters: “ABCDE”.
Suppose $O_1 = Insert[“12”, 2]$, which intends to insert string
“12” at position 2, i.e., between “A” and “BCDE”; and
$O_2 = Delete[2, 3]$, which intends to delete the two
characters starting from position 3, i.e., “CD”. After the
execution of these two operations, the intention-preserved
result (at all sites) should be: “A12BE”. However, the actual
result at site 0, obtained by executing $O_1$ followed by
executing $O_2$, would be: “A1CDE”, which clearly violates the
intention of $O_1$ since the character “2”, which was intended
to be inserted, is missing in the final text, and also violates
the intention of $O_2$ since characters “CD”, which were
intended to be deleted, are still present in the final text. A
serialization protocol might be used to ensure that all sites
execute $O_1$ and $O_2$ in the same order to get an identical
result “A1CDE”, but this identical result is still inconsistent
with the intentions of both $O_1$ and $O_2$.
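
The arithmetic of this example can be checked directly. The Python sketch below is illustrative only: the helper names `insert`, `delete`, and `include_insert_into_delete` are hypothetical, and the position adjustment handles just the non-overlapping cases, which is far less than a full string-wise transformation function.

```python
def insert(doc, text, pos):
    """Insert[text, pos]: put text at 1-based position pos."""
    return doc[: pos - 1] + text + doc[pos - 1 :]


def delete(doc, count, pos):
    """Delete[count, pos]: remove count characters starting at 1-based pos."""
    return doc[: pos - 1] + doc[pos - 1 + count :]


def include_insert_into_delete(count, pos, ins_text, ins_pos):
    """Adjust Delete[count, pos] so it can run after Insert[ins_text, ins_pos].

    Only the two non-overlapping cases are handled in this sketch.
    """
    if ins_pos <= pos:
        return count, pos + len(ins_text)
    if ins_pos >= pos + count:
        return count, pos
    raise NotImplementedError("insertion falls inside the deleted range")


doc = "ABCDE"
after_o1 = insert(doc, "12", 2)              # O1 executed first at site 0
naive = delete(after_o1, 2, 3)               # O2 executed in its original form
count, pos = include_insert_into_delete(2, 3, "12", 2)
preserved = delete(after_o1, count, pos)     # O2 executed in adjusted form
```

Executing $O_2$ unchanged removes “2B” rather than “CD”, whereas shifting its position by the length of the inserted string yields the intention-preserved result.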

It is important to recognize that intention violation is an 
inconsistency problem of a different nature from the diver- 
gence problem. The essential difference between divergence 
and intention violation is that the former can always be re- 
solved by a serialization protocol, but the latter cannot be 
fixed by any serialization protocol if operations were always 
executed in their original forms. 

A consistency model 

Due to the distinction of the intention-violation problem from
the divergence problem, one additional consistency
correctness criterion, intention-preservation, was proposed in
REDUCE [14]. The REDUCE correctness criteria for
consistency maintenance have been defined in the form of a
consistency model as follows.

Definition 3: A consistency model 
A cooperative editing system is consistent if it always main- 
tains the following properties: 

1. Convergence: when the same set of operations have 
been executed at all sites, all copies of the shared docu- 
ment are identical. 

2. Causality-preservation: for any pair of operations
$O_a$ and $O_b$, if $O_a \rightarrow O_b$, then $O_a$ is executed before
$O_b$ at all sites.

3. Intention-preservation: for any operation O, the ef- 
fects of executing O at all sites are the same as the 
intention of O, and the effect of executing O does not 
change the effects of independent operations. 

□ 


To support the three properties of the consistency 
model, REDUCE adopted the same state-vector timestamp- 
ing scheme as that in GROVE for achieving causality- 
preservation (or precedence in GROVE’s terminology). With 
the distinction of intention-preservation from convergence, 
two separate schemes were devised for supporting these two 
different properties: an undo/do/redo scheme for achieving 
convergence, and an operational transformation algorithm for 
achieving intention-preservation. 

To achieve convergence, a total ordering relationship
“$\Rightarrow$” among operations is defined [14]. However, operations
are allowed to be executed in any order as long as their
causality is preserved. When a new operation O is causally-ready
for execution, (1) undo operations in the HB which totally
follow O to restore the document to the state before their
execution; (2) do O; and finally (3) redo all operations that
were undone from the HB. It should be noted that the undo/redo
operations involved in this scheme are internal operations,
rather than external operations initiated from the user
interface [12]. Therefore, the undo/do/redo scheme should be
implemented in such a way that only the final result (instead
of the intermediate ones) produced at the end of the
undo/do/redo process is reflected on the user interface.

Transformation pre-/post-conditions 
Since transformation functions in REDUCE are not
responsible for ensuring convergence, they are not required to
satisfy the same condition as in GROVE. In REDUCE, when
operation $O_a$ is transformed against operation $O_b$, it is
required that the effect of the transformed operation $O_a'$ on
the document state that contains the impact of $O_b$ should be
the same as the effect of $O_a$ on the document state that does
not contain the impact of $O_b$. This type of transformation
is called Inclusion Transformation (IT), since it transforms
an operation $O_a$ against another operation $O_b$ in such a way
that the impact of $O_b$ is effectively included. The GROVE
transformation functions can be regarded as a kind of
inclusion transformation. Most importantly, it was recognized
that the correctness of this inclusion transformation relies on
the condition that both $O_a$ and $O_b$ are defined on the same
document state [17], so their parameters are comparable and
can be used to derive a proper adjustment to $O_a$. Failing
to recognize and to ensure this condition is the root of the
unsolved dOPT puzzle.

In search of a correct and sophisticated solution to
intention-preservation, REDUCE introduced another type of
transformation, called Exclusion Transformation (ET), which
transforms $O_a$ against another operation $O_b$ in such a way
that the impact of $O_b$ is effectively excluded from $O_a$ [17].
For example, $O_4$ and $O_1$ are independent operations but
generated from different document states, as shown in Fig. 1.
When $O_4$ arrives at site 0, it is incorrect to simply transform
$O_4$ against $O_1$. Instead, exclusion transformation should be
applied on $O_4$ against its causally preceding operation $O_2$ to
produce $O_4'$ in such a way that the impact of $O_2$ on $O_4$ is
excluded. Consequently, $O_4'$ effectively shares the same
document state with $O_1$, and then can be applied with the
inclusion transformation against $O_1$.

To capture the required relationship between operations
for correct transformation, the notion of operation context is
introduced. The context of a document state is the sequence
of operations executed on the initial document state to
arrive at the current document state. Given an operation O,
the definition context of O, denoted as DC(O), is the
context of the document state on which O is defined; and the
execution context of O, denoted as EC(O), is the context of
the document state on which O is to be executed. The
intention of an operation can be preserved if its definition
context matches its execution context, i.e., DC(O) = EC(O).

REDUCE uses two primitive transformation functions,
$IT(O_a, O_b)$ and $ET(O_a, O_b)$, to make an operation’s
definition context equivalent to its execution context. For
specifying pre-/post-conditions of the transformation
functions, two context-based relations are defined below (Note:
a context is expressed as an operation list in the rest of the
paper).

Definition 4: Context equivalent relation “$\sqcup$”

Given two operations $O_a$ and $O_b$, $O_a$ and $O_b$ are
context-equivalent, i.e., $O_a \sqcup O_b$, iff $DC(O_a) = DC(O_b)$. □

Definition 5: Context preceding relation “$\mapsto$”

Given two operations $O_a$ and $O_b$, $O_a$ is context preceding
$O_b$, i.e., $O_a \mapsto O_b$, iff $DC(O_b) = DC(O_a) + [O_a]$ (where
“+” expresses the concatenation of two lists). □

With the context-based relations, the pre-/post-conditions 
of the two transformation functions are specified as follows. 

Specification 1: $IT(O_a, O_b)$ : $O_a'$

1. Precondition for input parameters: $O_a \sqcup O_b$.

2. Postcondition for output: $O_b \mapsto O_a'$, and the effect of
$O_a'$ in $DC(O_a')$ is the same as the effect of $O_a$ in $DC(O_a)$.

□

Specification 2: $ET(O_a, O_b)$ : $O_a'$

1. Precondition for input parameters: $O_b \mapsto O_a$.

2. Postcondition for output: $O_b \sqcup O_a'$, and the effect of
$O_a'$ in $DC(O_a')$ is the same as the effect of $O_a$ in $DC(O_a)$.

□

The design of a pair of IT/ET functions for string-wise
operations, which satisfy the specified post-conditions, can
be found in [16, 17].
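
Because contexts are expressed as operation lists, the two relations and the preconditions of Specifications 1 and 2 reduce to list comparisons. The Python sketch below illustrates this; the function names and the `DC` table are hypothetical, and the definition contexts are an assumed reading of Fig. 1 in which $O_1$ and $O_2$ are generated on the initial state and $O_4$ is generated after $O_2$ has been executed.

```python
# Definition contexts for three operations of Fig. 1 (illustrative assumption).
DC = {"O1": [], "O2": [], "O4": ["O2"]}


def context_equivalent(a, b, dc=DC):
    """Definition 4: both operations are defined on the same document state."""
    return dc[a] == dc[b]


def context_preceding(a, b, dc=DC):
    """Definition 5: b is defined on the state reached by executing a."""
    return dc[b] == dc[a] + [a]


def can_apply_it(a, b, dc=DC):
    """Precondition of IT(a, b) in Specification 1."""
    return context_equivalent(a, b, dc)


def can_apply_et(a, b, dc=DC):
    """Precondition of ET(a, b) in Specification 2: b is context preceding a."""
    return context_preceding(b, a, dc)


# O4 cannot be inclusively transformed against O1 directly ...
direct_it_allowed = can_apply_it("O4", "O1")
# ... but it can first be exclusively transformed against O2.
et_allowed = can_apply_et("O4", "O2")
# By the ET postcondition, the result O4' is context-equivalent to O2.
dc_after_et = dict(DC, **{"O4'": DC["O2"]})
it_allowed_after_et = can_apply_it("O4'", "O1", dc_after_et)
```

This mirrors the earlier $O_4$ example: the IT precondition fails for $O_4$ against $O_1$ until the impact of $O_2$ has been excluded.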

The GOT control algorithm 

To ensure transformation pre-conditions, a Generic
Operational Transformation (GOT) control algorithm has been
devised [17]. Taking a causally-ready operation O and its
execution context EC(O) (i.e., the current contents of the HB) as
input parameters, the GOT control algorithm uses the IT/ET
functions to transform O into EO (the execution form of O)
such that DC(EO) = EC(O).

Three cases have been distinguished and handled
differently in the GOT control algorithm, as illustrated in
Fig. 3. In this example, we assume EC(O) = HB =
$[EO_1, EO_2, EO_3]$.

Case 1: All operations in EC(O) are causally preceding
O. It must be that DC(O) = EC(O), so that EO = O
(no transformation is performed).

Case 2: Operations causally preceding O are listed in
EC(O) before operations independent of O. Since
$EO_1 \rightarrow O$, $EO_2 \parallel O$, and $EO_3 \parallel O$, by inclusively
transforming O against $EO_2$ and $EO_3$ in sequence, we get EO
such that DC(EO) = EC(O).


[Figure 3: inputs are a causally-ready operation O and its
execution context $EC(O) = [EO_1, EO_2, EO_3]$; the output is the
execution form EO of O. The panels cover Case 1
($EO_1 \rightarrow O$, $EO_2 \rightarrow O$, $EO_3 \rightarrow O$), Case 2
($EO_1 \rightarrow O$, $EO_2 \parallel O$, $EO_3 \parallel O$), and Case 3.]

Fig. 3. Three-case analysis and handling by the GOT control
algorithm.

Case 3: At least one causally-preceding operation is
positioned after an independent operation in EC(O). This
is the case that the dOPT algorithm failed to handle
correctly. Since $EO_1 \rightarrow O$, $EO_2 \parallel O$, and $EO_3 \rightarrow O$, it
must be that $DC(O) = [EO_1, EO_3']$, where $EO_3'$ is the
original form of $EO_3$ when O was generated. Transforming
O directly against any operation in EC(O) would violate
the pre-conditions for IT/ET functions. The strategy
taken by the GOT algorithm is as follows: (1) apply
exclusion transformation on $EO_3$ against $EO_2$ (both
$EO_3$ and $EO_2$ are available in EC(O)) to obtain $EO_3'$;
(2) apply exclusion transformation on O against $EO_3'$ to
get an intermediate $O'$; and finally (3) apply inclusion
transformation on $O'$ against $EO_2$ and $EO_3$ in sequence
to get EO such that DC(EO) = EC(O).

To describe the GOT algorithm, a few notations need
to be introduced: Given a list of operations L, L[i, j]
denotes a sublist of L containing the operations from $EO_i$
to $EO_j$ inclusively; and $L^{-1}$ denotes the reverse of L.
LIT(O, L)/LET(O, L) is used to denote the application of the
IT/ET function on operation O against a list of operations
in L in sequence from left to right.

Algorithm 1: GOT(O, L): EO
O: a causally-ready operation.
L: the list of operations $[EO_1, EO_2, ..., EO_m]$ in EC(O).
EO: the execution form of O.

1. Scan L[1, m] from left to right to find the first operation
$EO_k$ such that $EO_k \parallel O$. If no such operation is
found, then return EO := O.

2. Otherwise, scan L[k + 1, m] to find operations causally
preceding O. If no such operation is found, then
return EO := LIT(O, L[k, m]).

3. Otherwise, let $L_1 = [EO_{c_1}, ..., EO_{c_r}]$ be the list of
operations in L[k, m] which are causally preceding O.

(a) Get $L_1' = [EO_{c_1}', ..., EO_{c_r}']$ as follows:

i. $EO_{c_1}' := LET(EO_{c_1}, L[k, c_1 - 1]^{-1})$.

ii. For $2 \le i \le r$,

$O_t := LET(EO_{c_i}, L[k, c_i - 1]^{-1})$;

$EO_{c_i}' := LIT(O_t, [EO_{c_1}', ..., EO_{c_{i-1}}'])$.

(b) $O' := LET(O, L_1'^{-1})$.

(c) return $EO := LIT(O', L[k, m])$.

□
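
The control flow of Algorithm 1 can be sketched in Python with the transformation functions passed in as parameters. This is a structural sketch under stated assumptions, not the REDUCE implementation: `got`, `lit`, `let`, and the `precedes` predicate are hypothetical names, list indices are 0-based, and the `it`/`et` functions used in the example are symbolic placeholders that only record which transformations would be applied and in what order.

```python
def lit(op, ops, it):
    """LIT: inclusion-transform op against ops, left to right."""
    for other in ops:
        op = it(op, other)
    return op


def let(op, ops, et):
    """LET: exclusion-transform op against ops, left to right."""
    for other in ops:
        op = et(op, other)
    return op


def got(op, hb, precedes, it, et):
    """Return the execution form of op for history buffer hb.

    precedes(eo) is True if the HB entry eo causally precedes op;
    every other HB entry is taken to be independent of op.
    """
    # Step 1: find the first independent operation.
    k = next((i for i, eo in enumerate(hb) if not precedes(eo)), None)
    if k is None:
        return op
    # Step 2: look for causally preceding operations after position k.
    cs = [i for i in range(k + 1, len(hb)) if precedes(hb[i])]
    if not cs:
        return lit(op, hb[k:], it)
    # Step 3(a): restore the forms of those operations as op saw them.
    l1_prime = []
    for c in cs:
        t = let(hb[c], reversed(hb[k:c]), et)
        l1_prime.append(lit(t, l1_prime, it))
    # Step 3(b): exclude them from op, then 3(c): include L[k, m].
    o_prime = let(op, reversed(l1_prime), et)
    return lit(o_prime, hb[k:], it)


# Symbolic transformation functions that only record the calls made.
it = lambda a, b: f"IT({a},{b})"
et = lambda a, b: f"ET({a},{b})"
hb = ["EO1", "EO2", "EO3"]

case1 = got("O", hb, lambda eo: True, it, et)
case2 = got("O", hb, lambda eo: eo == "EO1", it, et)
case3 = got("O", hb, lambda eo: eo in {"EO1", "EO3"}, it, et)
```

With the symbolic functions, `case3` spells out the three steps described for Case 3: an ET of $EO_3$ against $EO_2$, an ET of O against the result, and then IT against $EO_2$ and $EO_3$ in sequence.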

It can be shown that the pre-conditions required by the
transformation functions are always guaranteed by the GOT
control algorithm. Therefore, if the post-conditions are
always ensured by the transformation functions, then the
GOT control algorithm will transform O into EO, so that
the execution of EO on EC(O) will preserve the
intention of O. To achieve both intention-preservation and
convergence, the GOT control algorithm has been integrated
with the undo/do/redo scheme to form an
undo/transform-do/transform-redo scheme [17].

A solution to the dOPT puzzle 

In this section, we show how REDUCE solves the dOPT
puzzle. We assume, without losing generality, the total ordering
relationship “$\Rightarrow$” among the three operations in Fig. 2 is:
$O_3 \Rightarrow O_1 \Rightarrow O_2$. Also, we assume the REDUCE
transformation function $IT(O_a, O_b)$ uses the following shifting
rule: if both $O_a$ and $O_b$ are insertions and have the same
position parameter, the position of $O_a$ will be shifted. This
shifting rule is consistent with the priority rule used in GROVE.

Under REDUCE, the operation transformation and the
final document states (i.e., “xzy”) at sites 3 and 1 are the
same as they are under GROVE. The situation at site 2,
however, is different: $O_2$ first inserts “y” into the document.
Next, when $O_3$ arrives, $O_2$ has to be undone since
$O_3 \Rightarrow O_2$. Then $O_3$ is executed as is, and $O_2$ is inclusively
transformed against $O_3$ (according to $O_3 \parallel O_2$ and Case 2 in
the GOT algorithm) to become $O_2' = Insert[y, 2]$ according to
the shifting rule. After the execution of both $O_3$ and $O_2'$,
the document contains “zy”. Finally, when $O_1$ arrives, $O_2$ has
to be undone since $O_1 \Rightarrow O_2$. Then $O_1$ is executed as is
(since $O_3 \rightarrow O_1$), and $O_2$ is inclusively transformed
against $O_1$ (according to $O_2 \parallel O_1$ and Case 2 in the GOT
algorithm) to become $O_2'' = Insert[y, 3]$. After the execution
of $O_2''$, the document contains “xzy”, which is identical to
the document state at sites 3 and 1, and the intentions of all
three operations are preserved. In this particular example, the
exclusion transformation is not used, but in more complex
scenarios, such as the one shown in Fig. 1, exclusion
transformation is needed (see [17]).
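
The sequence of steps at site 2 can be traced in a few lines of Python. This is a hand-unrolled trace of this one scenario, not a general undo/transform-do/transform-redo implementation: `it_insert`, `do`, and `undo` are hypothetical helpers, operations are modeled as (character, position) pairs, and `it_insert` encodes only the shifting rule assumed above for character inserts.

```python
def it_insert(oa, ob):
    """IT for character inserts: shift oa if ob inserts at or before oa's position."""
    char, pos = oa
    return (char, pos + 1) if ob[1] <= pos else oa


def do(doc, op):
    """Execute an insert on the document string (1-based position)."""
    char, pos = op
    return doc[: pos - 1] + char + doc[pos - 1 :]


def undo(doc, op):
    """Remove the character that op inserted."""
    char, pos = op
    assert doc[pos - 1] == char
    return doc[: pos - 1] + doc[pos:]


O1, O2, O3 = ("x", 1), ("y", 1), ("z", 1)

doc = do("", O2)                # O2 is generated locally at site 2

# O3 arrives; O3 => O2, so O2 is undone, O3 is done,
# and O2 is redone in its transformed form.
doc = undo(doc, O2)
doc = do(doc, O3)
O2_1 = it_insert(O2, O3)
doc = do(doc, O2_1)
state_after_o3 = doc

# O1 arrives; O1 => O2, so the redone O2 is undone again.
doc = undo(doc, O2_1)
doc = do(doc, O1)               # executed as is because O3 -> O1
O2_2 = it_insert(O2_1, O1)
doc = do(doc, O2_2)
```

The intermediate and final states match the walkthrough: the document passes through “zy” and ends in the same state as at sites 3 and 1.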

THE JUPITER APPROACH 

The Jupiter collaboration system was developed at Xerox
PARC [11]. Since Jupiter already had a central server
for maintaining the states of objects (e.g., white-boards, text
documents, etc.) in the shared persistent virtual world, it is
natural to use this central server for supporting consistency
maintenance of shared objects as well. The Jupiter
consistency maintenance algorithm was derived from the dOPT
algorithm. The most interesting part of the Jupiter approach
is the adaptation of the dOPT optimistic algorithm to an
environment with multiple replicated client sites plus one
centralized server site.

In Jupiter, the shared documents are replicated at all co- 
operating client sites, which is the same as in GROVE. The 
difference is that the shared documents are also maintained 
at the central server and communications happen only be- 
tween a client and the server (i.e., a 2-way communication). 
When an updating operation is generated at a client site, it is 
immediately executed at the local client site (for fast response 
to user actions), and then propagated to the central server. 
The server first transforms the incoming operation if neces- 
sary, then executes the transformed operation on its copy of 
the shared document, and finally broadcasts the transformed 
operation to all other client sites. Upon receiving an oper- 
ation propagated from the central server, a client site may 
transform this operation if necessary, and then executes it on 
the local copy of the document. This star-like topology of 
communication eliminates the concern for ensuring causality 
(i.e., causality-violation never occurs). It also substantially
simplifies the operational transformation control algorithm. 

To achieve convergence, the Jupiter transformation func- 
tion is required to satisfy the same property as that re- 
quired by the dOPT algorithm. However, Jupiter uses a 2- 
dimensional state space graph, instead of a linear Log/HB, 
to keep track of all possible operation transformation paths 
to guide the selection of the right operations for transformation.
The Jupiter algorithm ensures that any pair of operations 
involved in a transformation must have originated from the 
same starting state in the state space graph, which is es- 
sentially the same as ensuring the context equivalent pre- 
condition by the GOT algorithm in the REDUCE approach. 
Therefore, the Jupiter algorithm is able to correct the dOPT 
algorithm under the condition that only 2-way communica- 
tions are allowed in the system. An alternative approach to 
correcting the dOPT algorithm for the 2-way communication 
special case can be found in [1]. 
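The client/server flow described above can be sketched for a single
client-server link. This is only an illustration under stated
assumptions, not the Jupiter implementation: operations are limited to
single-character inserts, the pair of message counters stands in for a
position in the 2-dimensional state space, and the names Ins, xform,
Endpoint, and the rule that the server's character wins position ties
are all hypothetical choices made for this example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Ins:
    pos: int  # 0-based insertion index
    ch: str


def apply(doc, op):
    return doc[:op.pos] + op.ch + doc[op.pos:]


def xform(a, b, a_first):
    """Transform two concurrent inserts defined on the same state.

    Returns (a', b'): a' is executed after b, b' is executed after a.
    a_first decides which character comes first on a position tie.
    """
    if a.pos < b.pos or (a.pos == b.pos and a_first):
        return a, replace(b, pos=b.pos + 1)
    return replace(a, pos=a.pos + 1), b


class Endpoint:
    """One end of a 2-way link (assumed bookkeeping, not from the paper)."""

    def __init__(self, doc, wins_ties):
        self.doc = doc
        self.wins_ties = wins_ties
        self.my_msgs = 0      # operations generated locally
        self.other_msgs = 0   # operations received from the other end
        self.outgoing = []    # local operations not yet acknowledged

    def generate(self, op):
        # Execute locally at once, then hand the message to the network.
        self.doc = apply(self.doc, op)
        msg = (op, self.my_msgs, self.other_msgs)
        self.outgoing.append((op, self.my_msgs))
        self.my_msgs += 1
        return msg

    def receive(self, msg):
        op, sender_count, acked = msg
        # Drop local operations the sender had already seen.
        self.outgoing = [(o, n) for (o, n) in self.outgoing if n >= acked]
        assert sender_count == self.other_msgs
        # Transform the incoming operation against each pending local one,
        # keeping both sides defined on the same starting state.
        for i, (mine, n) in enumerate(self.outgoing):
            op, mine = xform(op, mine, a_first=not self.wins_ties)
            self.outgoing[i] = (mine, n)
        self.doc = apply(self.doc, op)
        self.other_msgs += 1


client = Endpoint("ab", wins_ties=False)
server = Endpoint("ab", wins_ties=True)

c1 = client.generate(Ins(1, "x"))   # client document becomes "axb"
s1 = server.generate(Ins(1, "y"))   # server document becomes "ayb"
c2 = client.generate(Ins(3, "z"))   # client document becomes "axbz"

server.receive(c1)
server.receive(c2)
client.receive(s1)
```

In a system with several clients, the server would presumably keep one
such endpoint per client and forward each transformed operation to the
other links; that part is omitted here.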

THE ADOPTED APPROACH 

The adOPTed algorithm adopted the same correctness criteria
from GROVE for consistency maintenance: convergence and
precedence (i.e., causality-preservation). It also followed
GROVE in using a fully distributed and replicated architecture.
What is different in the adOPTed algorithm is that it requires
an additional property for transformation functions to satisfy.
Given two operations Oa and Ob, let Oa' = T(Oa, Ob) and
Ob' = T(Ob, Oa). The transformation function T is required to
possess the following two properties:

Transformation Property 1 (TP1):

    Oa ∘ Ob' ≡ Ob ∘ Oa'

that is, executing Oa followed by Ob' yields the same document
state as executing Ob followed by Oa'.

Transformation Property 2 (TP2): For any O,

    T(T(O, Oa), Ob') = T(T(O, Ob), Oa')

TP1 is the same as that required by the dOPT algorithm
and the Jupiter algorithm, but TP2 is new in the adOPTed
algorithm. TP2 ensures that the transformation of operation O
along different paths will yield the same resulting operation.
These two properties can be illustrated by using a directed
graph, called an interaction model [13], as shown in Figure 4.

(a) Transformation property 1: Oa ∘ Ob' ≡ Ob ∘ Oa'

(b) Transformation property 2: T(T(O, Oa), Ob') = T(T(O, Ob), Oa')

Fig. 4. Interaction model illustration of transformation properties.

The vertices of the interaction model graph are labeled
by document states, and the edges are labeled by operations.
For example, the four vertices of the square in Figure 4-(a)
are labeled by four document states: S0, S1, S2, and S3,
respectively; the two solid edges are labeled by two original
operations: Oa and Ob, respectively; and the other two dashed
edges are labeled by two transformed operations:
Oa' = T(Oa, Ob) and Ob' = T(Ob, Oa), respectively. Essentially,
TP1 ensures unique vertex labeling, whereas TP2 ensures unique
edge labeling in the interaction model graph. It has been shown
in [13] that TP1 and TP2 are the necessary and sufficient
conditions for ensuring convergence in systems which allow
N-way communication (where N is the number of cooperating
sites).

The adOPTed algorithm uses an N-dimensional interaction
model graph to keep track of all valid paths of operation
transformations. The N-dimensional interaction model graph can
be viewed as a generalization of the 2-dimensional state space
in the Jupiter algorithm, and it also plays the same role in
guiding the selection of the right path and right operations
for transformation. The adOPTed algorithm ensures that any pair
of operations involved in a transformation is defined on the
same document state.
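The two transformation properties can be checked mechanically for a
small transformation function. The sketch below is an assumption made
for illustration, not the GROVE or adOPTed function: T handles only
single-character inserts and breaks position ties in favor of the
higher site number. It tests TP1 and TP2 exhaustively over a short
document.

```python
from dataclasses import dataclass, replace
from itertools import permutations, product


@dataclass(frozen=True)
class Ins:
    pos: int   # 0-based insertion index
    ch: str
    site: int  # originating site, used only to break position ties


def apply(doc, op):
    return doc[:op.pos] + op.ch + doc[op.pos:]


def T(o, other):
    """Transform o to include the effect of a concurrent insert `other`.

    Both operations are assumed to be defined on the same document state.
    """
    if o.pos < other.pos or (o.pos == other.pos and o.site > other.site):
        return o
    return replace(o, pos=o.pos + 1)


def tp1_holds(doc, oa, ob):
    # Oa followed by Ob' must give the same state as Ob followed by Oa'.
    left = apply(apply(doc, oa), T(ob, oa))
    right = apply(apply(doc, ob), T(oa, ob))
    return left == right


def tp2_holds(o, oa, ob):
    # Transforming O along either path must give the same operation.
    return T(T(o, oa), T(ob, oa)) == T(T(o, ob), T(oa, ob))


doc = "abc"
positions = range(len(doc) + 1)

tp1_ok = all(
    tp1_holds(doc, Ins(pa, "x", 1), Ins(pb, "y", 2))
    for pa, pb in product(positions, repeat=2)
)

tp2_ok = all(
    tp2_holds(Ins(p, "z", s), Ins(pa, "x", sa), Ins(pb, "y", sb))
    for p, pa, pb in product(positions, repeat=3)
    for s, sa, sb in permutations((1, 2, 3))
)
```

An exhaustive check of this kind only covers the operation types and
document sizes it enumerates; it is not a proof for richer operation
sets such as those that include Delete.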


Fig. 5. The adOPTed solution to the dOPT puzzle 

Using the adOPTed algorithm and the same transformation
function T11 from GROVE, the solution to the dOPT puzzle can
be illustrated as in Figure 5. At sites 3 and 1, the operation
transformation and execution follow the same path: O3 and O1
are executed as is, but O2 is transformed against O3 and O1 in
sequence, resulting in O2' = Ins[y, 2], then O2'' = Ins[y, 3].
(In the meantime, the adOPTed algorithm also produces
O3' = Ins[z, 1] and O1' = Ins[x, 1], which are of no use at
sites 3 and 1.) The execution of O3, O1, and O2'' in sequence
results in the final document state “xzy”. At site 2, a
different path in the interaction model graph is taken. First,
O2 is executed as is. When O3 arrives, it is transformed
against O2 to become O3' = Ins[z, 1]. Meanwhile, the adOPTed
algorithm also transforms O2 against O3 to produce
O2' = Ins[y, 2], and both O3' and O2' are maintained at proper
positions in the interaction model graph. When O1 arrives, the
adOPTed algorithm searches the interaction model graph to find
the right operation O2' (instead of O2, which was used in the
dOPT algorithm) for transformation to get O1' = Ins[x, 1]. In
the meantime, the adOPTed algorithm also produces and
maintains O2'' = Ins[y, 3] at site 2 (O2'' is of no use in this
example). The execution of O2, O3', and O1' in sequence results
in an identical document state “xzy”.
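The two paths above can be replayed in a few lines. This is a sketch
under assumptions: the original forms O1 = Ins[x, 1], O2 = Ins[y, 1],
and O3 = Ins[z, 1] on an empty document are inferred from the
transformed forms quoted in the text, positions are 1-based as in the
text, and the tie-breaking rule (the higher site number keeps its
position) is chosen only because it reproduces those quoted forms. The
function T is a stand-in, not the GROVE function itself.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Ins:
    ch: str
    pos: int   # 1-based position, following the text
    site: int


def apply(doc, op):
    return doc[:op.pos - 1] + op.ch + doc[op.pos - 1:]


def T(o, other):
    """Include the effect of concurrent insert `other` in `o` (assumed rule)."""
    if o.pos < other.pos or (o.pos == other.pos and o.site > other.site):
        return o
    return replace(o, pos=o.pos + 1)


# Inferred original operations (an assumption, see the lead-in).
O1 = Ins("x", 1, site=1)   # generated at site 1 after executing O3
O2 = Ins("y", 1, site=2)   # concurrent with both O3 and O1
O3 = Ins("z", 1, site=3)

# Path taken by sites 3 and 1: O3, O1 as is, then O2 transformed twice.
O2p = T(O2, O3)            # O2'
O2pp = T(O2p, O1)          # O2''
doc_31 = ""
for op in (O3, O1, O2pp):
    doc_31 = apply(doc_31, op)

# Path taken by site 2: O2 as is, then O3', then O1 against O2' (not O2).
O3p = T(O3, O2)            # O3'
O1p = T(O1, O2p)           # O1'
doc_2 = ""
for op in (O2, O3p, O1p):
    doc_2 = apply(doc_2, op)

# Transforming O1 against the original O2 mixes two different contexts.
O1_wrong = T(O1, O2)
doc_2_wrong = ""
for op in (O2, O3p, O1_wrong):
    doc_2_wrong = apply(doc_2_wrong, op)
```

Under these assumptions both paths end in the same state, while the
mismatched transformation leaves site 2 with a different document,
which is the kind of divergence the interaction model graph is meant
to rule out.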

AN OPTIMIZED ALGORITHM: GOTO 
Without requiring TP1 and TP2, the GOT control algorithm,
integrated with the undo/do/redo scheme [17], is the only
known solution for achieving both intention-preservation and
convergence. An interesting question is: what could the GOT
algorithm achieve if TP1 and TP2 are satisfied by the IT/ET
functions? In this section, we will answer this question and
propose a new optimized GOT control algorithm.

To take advantage of the two additional post-conditions
TP1 and TP2, we modify the original context-based relations
in Definitions 4 and 5 as follows: replace the equal sign
“=” with the equivalence sign “≡”. Obviously, the equal
relation between operation contexts is a special case of the
equivalence relation “≡”. With this generalization of
context-based relations and the extension of pre-/post-conditions
for IT/ET functions, we found that the original GOT control
algorithm can ensure both intention-preservation and
convergence, without integrating with the undo/do/redo scheme
or using a multi-dimensional graph. The verification of this
claim can follow reasoning similar to that used in [13], which
is, however, beyond the scope of this paper.

Moreover, the two additional post-conditions TP1 and
TP2 can be employed to optimize the GOT control algorithm
by reducing the number of IT/ET transformations. The
optimized algorithm, named GOTO (GOT Optimized), resembles
the original GOT algorithm in handling the first and
the second cases (see Fig. 3). The third case is handled
differently. In addition to performing transformations on the
definition context of O, we also perform transformations on
the execution context of O to make the two contexts
equivalent. This can be achieved by executing the following two
steps:

1. Transform the execution context EC(O) into an equivalent
EC(O)' such that all operations causally preceding
O are positioned before the independent operations in
EC(O)'. Let EC(O)' = EC(O)'.left + EC(O)'.right,
where EC(O)'.left is the sublist of causally preceding
operations, and EC(O)'.right is the sublist of
independent operations.

2. Apply the inclusion transformation on O against
the list of independent operations in EC(O)'.right.
The transformation pre-condition is satisfied because
EC(O)'.left = DC(O).

The question now is: how to transform EC(O) into such
an equivalent EC(O)'?

By using IT and ET functions, the Transpose function is 
defined to transform and swap two operations in an execu- 
tion context. 

Function 1 : Transpose(Oa, Ob) : (Ob', Oa')

{

    Ob' := ET(Ob, Oa);

    Oa' := IT(Oa, Ob');

    return (Ob', Oa');

}

The pre-condition for Oa and Ob is: Oa ↦ Ob. The
post-condition for Oa' and Ob' is: Ob' ↦ Oa'. Based on
the Transpose function, the procedure LTranspose(L) is defined,
which transforms and circularly shifts the list of operations
in L.

Procedure 1 : LTranspose(L)

{

    for (i = |L|; i > 1; i--)

        (L[i-1], L[i]) := Transpose(L[i-1], L[i]);

}

According to TP1 and TP2, and the definition of Transpose,
it must be that L ≡ L', where L' is the list of operations
after calling LTranspose(L).
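The effect of Transpose and LTranspose can be sketched for
single-character inserts. One deliberate departure from Function 1 is
assumed here: instead of calling separate ET and IT functions, the
sketch swaps two adjacent inserts by a direct case analysis on both
original positions, which plays the same role while sidestepping the
question of how IT should break a position tie after ET. Positions are
0-based, and the names Ins, transpose, ltranspose, and run are
hypothetical.

```python
from dataclasses import dataclass, replace
from itertools import product


@dataclass(frozen=True)
class Ins:
    pos: int  # 0-based insertion index
    ch: str


def run(doc, ops):
    for op in ops:
        doc = doc[:op.pos] + op.ch + doc[op.pos:]
    return doc


def transpose(oa, ob):
    """Swap two inserts where oa was executed immediately before ob.

    Returns (ob', oa') such that executing ob' then oa' has the same
    effect as executing oa then ob.
    """
    if ob.pos <= oa.pos:
        # ob's character lands at or before oa's, so oa shifts right.
        return ob, replace(oa, pos=oa.pos + 1)
    # ob's character lands after oa's, so ob shifts left without oa.
    return replace(ob, pos=ob.pos - 1), oa


def ltranspose(ops):
    """Circularly shift the last operation to the front (Procedure 1)."""
    ops = list(ops)
    for i in range(len(ops) - 1, 0, -1):
        ops[i - 1], ops[i] = transpose(ops[i - 1], ops[i])
    return ops


# One concrete case.
example = [Ins(1, "a"), Ins(2, "b"), Ins(0, "c")]
example_shifted = ltranspose(example)

# Exhaustive check over every valid three-insert history on "XY".
doc = "XY"
all_equivalent = True
for p1, p2, p3 in product(range(3), range(4), range(5)):
    ops = [Ins(p1, "a"), Ins(p2, "b"), Ins(p3, "c")]
    shifted = ltranspose(ops)
    same_state = run(doc, shifted) == run(doc, ops)
    same_order = [o.ch for o in shifted] == ["c", "a", "b"]
    if not (same_state and same_order):
        all_equivalent = False
```

The check confirms, for this restricted operation set, that the list
before and after the shift produces the same document state, which is
the equivalence the paragraph above relies on.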

As an example, the handling of case 3 by the GOTO algorithm
is shown in Fig. 6. In this example, we can transpose
EO2 and EO3 in EC(O) by calling Transpose(EO2, EO3),
so that an equivalent execution context
EC(O)' = [EO1, EO3', EO2'] can be obtained. Then, since
DC(O) = [EO1, EO3'], we can apply an inclusion transformation
on O against EO2' to get EO, such that DC(EO) = EC(O)'. To
transform O into EO in this example, three IT/ET
transformations (one Transpose call costs one IT and one ET
transformation) are needed under the GOTO control algorithm,
whereas four IT/ET transformations are needed under the GOT
control algorithm.

Case 3. EO1 → O, EO2 ∥ O, EO3 → O; DC(O) = [EO1, EO3']

Fig. 6. The handling of mixed independent and dependent operations
by the GOTO control algorithm

Algorithm 2: GOTO(O, L): EO

O: a causally-ready operation.

L: the list of operations [EO1, EO2, ..., EOm] in EC(O).

EO: the execution form of O.

1. Scan L[1, m] from left to right to find the first operation
EOk such that EOk ∥ O. If no such operation is
found, then return EO := O.

2. Otherwise, scan L[k+1, m] to find operations causally
preceding O. If no such operation is found, then
return EO := LIT(O, L[k, m]).

3. Otherwise, let L1 = [EO_c1, ..., EO_cr] be the list of
operations in L[k, m] which are causally preceding O.

(a) For 1 ≤ i ≤ r:

LTranspose(L[k+i-1, c_i]);

(b) return EO := LIT(O, L[k+r, m]).

□

It can be shown that the pre-conditions required by
the transformation functions are always guaranteed by the
GOTO control algorithm. Therefore, if the post-conditions,
including TP1 and TP2, are always ensured by the
transformation functions, then the GOTO control algorithm will
transform O into EO, so that the execution of EO on EC(O)
will preserve the intention of O and ensure convergence.
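The three steps of Algorithm 2 can be sketched end to end for a
mixed case like the one in Fig. 6. The sketch rests on several
assumptions that are not from the paper: operations are
single-character inserts with 0-based positions; causality is
recorded as an explicit set of preceding operation ids (deps) rather
than by timestamps; transposition uses a direct case analysis in
place of separate ET and IT calls; position ties in it() favor the
higher site number; and goto() works on a copy of the history list.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Ins:
    oid: str                       # operation id (hypothetical bookkeeping)
    pos: int                       # 0-based insertion index
    ch: str
    site: int                      # used only to break position ties
    deps: frozenset = frozenset()  # ids of causally preceding operations


def run(doc, ops):
    for op in ops:
        doc = doc[:op.pos] + op.ch + doc[op.pos:]
    return doc


def it(o, other):
    """Inclusion transformation of o against a concurrent insert."""
    if o.pos < other.pos or (o.pos == other.pos and o.site > other.site):
        return o
    return replace(o, pos=o.pos + 1)


def lit(o, ops):
    """Inclusion transformation of o against a list, left to right."""
    for e in ops:
        o = it(o, e)
    return o


def transpose(oa, ob):
    """Swap adjacent inserts (oa executed just before ob)."""
    if ob.pos <= oa.pos:
        return ob, replace(oa, pos=oa.pos + 1)
    return replace(ob, pos=ob.pos - 1), oa


def ltranspose(hb, lo, hi):
    """Shift hb[hi] down to index lo, transposing it past hb[lo:hi]."""
    for i in range(hi, lo, -1):
        hb[i - 1], hb[i] = transpose(hb[i - 1], hb[i])


def goto(o, history):
    hb = list(history)
    # Step 1: find the first operation independent of o.
    k = next((i for i, e in enumerate(hb) if e.oid not in o.deps), None)
    if k is None:
        return o
    # Steps 2 and 3: move causally preceding operations in front of the
    # independent ones, then transform o against the independent tail.
    preceding = [i for i in range(k + 1, len(hb)) if hb[i].oid in o.deps]
    for r, c in enumerate(preceding):
        ltranspose(hb, k + r, c)
    return lit(o, hb[k + len(preceding):])


# Site 1 generates O1, O3, O in that order; site 2 generates O2 after O1.
O1 = Ins("O1", 0, "a", site=1)
O3 = Ins("O3", 1, "c", site=1, deps=frozenset({"O1"}))
O = Ins("O", 1, "x", site=1, deps=frozenset({"O1", "O3"}))
O2 = Ins("O2", 0, "b", site=2, deps=frozenset({"O1"}))

# Site 2 history when O arrives: EO1 -> O, EO2 || O, EO3 -> O.
EO3 = goto(O3, [O1, O2])
EO = goto(O, [O1, O2, EO3])
doc_site2 = run("", [O1, O2, EO3, EO])

# Site 1 executes its own operations as is, then the transformed O2.
EO2 = goto(O2, [O1, O3, O])
doc_site1 = run("", [O1, O3, O, EO2])
```

In this scenario the history at site 2 needs one transposition and one
inclusion transformation for O, matching the count of three IT/ET
steps discussed for Fig. 6, and both sites end with the same document.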

CONCLUSIONS AND FUTURE DIRECTIONS 

Many people have experience using various editors. Not
so many people have recognized that there are many
interesting research issues in an editor when it is used in a
real-time collaborative context. Even fewer people have come to
learn that some research issues in real-time group editors,
such as consistency maintenance, are so challenging that a
decade of exploration has not been enough to exhaust
their research potential. In this paper, we have reviewed a
number of major operational transformation algorithms for
consistency maintenance in real-time group editors, including
the dOPT algorithm, the GOT algorithm, the Jupiter algorithm,
and the adOPTed algorithm, and have proposed a new
optimized transformation control algorithm - the GOTO
algorithm. In this concluding section, we summarize the major
achievements in the past decade on the transformation-based
consistency maintenance techniques and point out the major
open issues for further exploration.

Major achievements 

Three inconsistency problems - divergence,
causality-violation, and intention-violation - have been
identified and explored. Particularly, the non-serializable
intention-violation problem has been distinguished from the
serializable divergence problem. Corresponding to these three
problems, consistency correctness criteria consist of three
properties: convergence, causality-preservation, and
intention-preservation. It is useful to integrate these three
properties in a consistency model, which effectively specifies
what consistency has been promised to the system users and what
properties must be supported by the underlying system
algorithms.

The discovery of the necessary transformation
pre-conditions has been a significant step toward the design
of correct transformation control algorithms. The notion
of operation context is very useful in capturing the
required relationship between operations for correct
transformation. Alternative approaches to ensuring
transformation pre-conditions include the GOT/GOTO control
algorithms working on a 1-dimensional history buffer, the
Jupiter algorithm working on a 2-dimensional state space
graph, and the adOPTed algorithm working on an N-dimensional
interaction model graph.

Two types of transformation functions have been proposed:
inclusion and exclusion transformations. For algorithms that
use a multi-dimensional data structure to keep track of
operations in their original, intermediate, and executed forms,
such as the Jupiter and adOPTed algorithms, only inclusion
transformation is needed. For algorithms that use a
1-dimensional history buffer to save operations in their
executed form only, such as the GOT and GOTO algorithms,
exclusion transformation is needed in addition to inclusion
transformation, in order to recover operations’ original and
intermediate forms from their executed forms.

The identification of proper transformation
post-conditions has played a crucial role in the
design of both the generic transformation control algorithms
and application-dependent transformation functions. By
requiring context-based post-conditions, the GOT control
algorithm can achieve intention-preservation. The context-based
post-conditions, however, do not capture the conditions for
ensuring convergence, so the GOT control algorithm must be
integrated with an undo/do/redo scheme to achieve
convergence. In essence, undo/redo can also be viewed as a kind
of transformation, which is performed directly on the
document states rather than on the operations. By requiring
TP1 only, the Jupiter algorithm can achieve convergence in
systems which are restricted to 2-way communication. By
requiring both TP1 and TP2, the adOPTed algorithm achieves
convergence in systems which allow N-way communication.
Neither TP1 nor TP2, however, captures the conditions for
ensuring intention-preservation, so intention-preservation has
been implicitly handled by transformation functions in the
dOPT algorithm, the Jupiter algorithm, and the adOPTed
algorithm. By requiring both TP1 and TP2, in addition
to the context-based post-conditions, the GOT control
algorithm alone is able to achieve both intention-preservation
and convergence. By performing transformations on both
definition and execution contexts, the GOTO algorithm is
able to optimize the GOT algorithm by reducing the number
of transformations.

Open issues and future directions 

The correctness of the whole operational transformation
scheme relies on the satisfaction of both transformation
pre-conditions and post-conditions. Much work has been done
on the design of correct generic transformation control
algorithms to ensure transformation pre-conditions. However,
not much work has been done on the design of
application-dependent transformation functions which could
really ensure transformation post-conditions [16]. We have
learned that TP1 and TP2 have to be satisfied by transformation
functions in order to ensure convergence, but we know little
about how to verify whether an existing transformation function
really satisfies TP1 and TP2. In fact, as illustrated in [17],
some seemingly correct transformation functions do not really
satisfy TP1 and TP2. More serious attention should be given to
the design of transformation functions to better understand
the intrinsic interactions (in the form of pre-/post-conditions)
between transformation functions and transformation control
algorithms.

Research should also be directed toward formal
specification and verification of operational transformation
concepts, properties, and algorithms. This formalization and
verification is necessary for rigorously proving the
correctness of the algorithms and for analyzing and improving
the time and space complexities of existing algorithms. In [1],
a Calculus for Concurrent Update (CCU) has been derived from the
dOPT algorithm as a tool for the purpose of formal modeling
and verification of consistency-preserving operational
transformation. Team Automata [6] is another mathematical
model for describing the interaction of groupware system
components. More work needs to be done in developing and
applying innovative theoretical tools to verify operational
transformation algorithms and systems.

Future research should distinguish and explore two types of
consistency: one is syntactic consistency, which is concerned
with whether all sites have the same view of the shared
objects, regardless of whether the common view makes sense
in the application context; and the other is semantic
consistency, which is concerned with whether all sites have the
same view of the shared objects, as well as whether the common
view makes sense in the application context. There may exist
many levels of syntactic consistency and semantic consistency
in a particular application context. Previous work has mainly
explored issues related to syntactic consistency. Particularly,
the term intention as defined in [14, 17] and used in this paper
has captured only a small piece of the much richer meaning of
intention from the human user’s perspective. This brings up
interesting areas of research concerned with characterization
and preservation of the human user’s intentions in
collaborative contexts, or group intentions. It may be
infeasible for the system alone to automatically determine the
human group intentions for different groups with divergent
group goals. The system, however, could and should have
mechanisms to help the group users decide their group
intentions and resolve their conflicts. In general, we advocate
a groupware system design paradigm which builds a sufficient
amount of generic supporting mechanism into the system, but
leaves the high-level collaboration policy decisions up to the
system users. A good groupware system should be easily tunable
by its users for supporting various collaboration needs [2, 10].

Much effort has been put into achieving the shortest
response time (as short as in single-user editors), but not
much research has been done on notification policy - when
and how to make local updates public to achieve global
consistency. Alternatives to notifying remote sites immediately
after executing an operation at the local site include periodic
notification, notification on demand, greying out the screen
to tell the user that the displayed information is out-of-date,
etc. Future research should be conducted on mechanisms for
supporting alternative notification policies and their
applicability in different application environments.

Operation granularity is another unexplored issue.
Current transformation algorithms are only capable of handling
fine-grain primitive operations, such as Insert and Delete.
Useful editors, however, must offer the end user higher-level
compound operations, such as Move and Replace. On
one hand, the system needs additional mechanisms to support
coarse-grain compound operations as an atomic sequence of
primitive operations while still ensuring consistency
properties. The richer semantics of the compound operations, on
the other hand, could help the system to better understand
and preserve the user’s intentions.

A number of prototype group editors have been built in
the past by various research groups for testing the feasibility
of transformation-based consistency maintenance algorithms,
and for investigating system design and implementation
issues. GROVE has been used in several real groups for a
variety of design activities to evaluate the system from the
users’ perspective and to gain usage experience [4, 5]. Since
then, however, little has been reported on using this type of
system in real-life collaborative environments to study the
user’s working modes in using the system, and to conduct
statistical analysis of conflicts. Much more research effort
should be directed toward better understanding the potential
effects of this type of system on people, their work and
interactions.

Although all the transformation-based consistency
maintenance algorithms and functions were designed in the
context of text editing, many of them are actually quite general
and potentially applicable in other domains of group
editing. It would be interesting and useful to apply operational
transformation in graphics/image/multimedia editors to
further validate the generic algorithms and to gain more
insights in the design and application of these types of systems.
Even techniques used in transforming a sequence of characters
could potentially be applicable in other real-time groupware
systems, which allow concurrent insertion/deletion of
any sequence of objects with a linearly ordered relationship.
Moreover, operational transformation has been found very
useful in supporting user-initiated collaborative undo
operations [12, 13].

Consistency maintenance is a fundamental issue in many
areas of computing systems, including operating systems,
database systems, distributed shared memory systems,
and groupware systems. Research on real-time group
editors, as a special class of distributed systems supporting
human-computer-human interactions, has drawn inspiration
from traditional distributed computing techniques (e.g.,
causal/total ordering of events, state-vector timestamping,
serialization, etc.), and has also invented the non-traditional
operational transformation technique to address its special
issues, such as intention-preservation. The generalization and
application of this unique operational transformation
technique to other areas of distributed computing and CSCW is
an exciting direction for future exploration.